Testing Architecture and Methodology¶

The reports on this page analyze the performance and scalability of the GameBeam framework under real-world conditions.

Test Pipeline¶

To collect the metrics for this report, we created a tool that automates the execution of multiple test configurations through batch processing. The pipeline consists of several key components:

  1. Configuration Setup: Tests are defined using customizable parameters including resolution, frame rate, client count, audio settings, hardware acceleration, and test duration.
  2. Orchestration: The batch testing module sequentially executes each configuration, optionally repeating each test multiple times to ensure statistical validity.
  3. Data Collection: During execution, the framework collects comprehensive performance metrics from both the server and clients.
  4. Storage & Analysis: All metrics are stored in a PostgreSQL database.
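The four stages above can be sketched as a minimal batch loop. All class, field, and function names here are illustrative, not the framework's actual API:

```python
from dataclasses import dataclass

@dataclass
class TestConfig:
    # Stage 1: the customizable parameters of a single test
    width: int
    height: int
    frame_rate: int
    clients: int
    audio: bool
    hardware: bool          # hardware-accelerated encoding on/off
    duration_s: int = 60

def execute_run(cfg: TestConfig) -> dict:
    """Placeholder for stages 2-3: launch the host and clients, wait
    cfg.duration_s seconds, and return the collected metrics."""
    return {"config": cfg}

def run_batch(configs: list, repeats: int = 6) -> list:
    """Stage 2: execute every configuration `repeats` times. Stage 4
    would persist each result row to PostgreSQL instead of a list."""
    results = []
    for cfg in configs:
        for run in range(repeats):
            results.append((cfg, run, execute_run(cfg)))
    return results
```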

Configurations¶

Both the client and host computers are synchronized to UTC time with sub-millisecond precision to accurately calculate end-to-end latency. This was accomplished by incorporating a GPS-based stratum 1 server on the LAN with the host, while the clients were synchronized using Amazon Time Sync, which supports clock synchronization within microseconds.

Tests were conducted with the following host hardware configuration:

  • CPU: AMD Ryzen 7 7800X3D
  • RAM: 64 GB DDR5
  • GPU: Nvidia RTX 4090
  • OS: Windows 11 Pro (24H2) 64-bit
  • WAN: Verizon FiOS 1 Gbps symmetric

Clients were virtualized cloud computing instances with the following configuration:

  • Provider: Amazon AWS
  • Region: us-east-1 (Virginia)
  • CPU: 2 cores (Intel Xeon E5-2676 v3, Haswell)
  • RAM: 4GB
  • OS: Linux
  • WAN: Undisclosed by Amazon

The test suite includes a range of configurations to evaluate performance across different scenarios; each combination was tested for 60 seconds:

  • Client Load Testing: From 0 to 3 simultaneous clients connecting to the host
  • Resolution Scaling: Testing at multiple resolutions (720p, 1080p, 1440p)
  • Frame Rate Variants: Testing at both 30 FPS and 60 FPS targets
  • Feature Toggles: Tests with and without audio streaming
  • Encoding Options: Comparison between hardware and software encoding
  • Statistical Validity: Each configuration can be executed multiple times (typically 6 runs per configuration) to ensure reliable results
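Multiplying these dimensions out gives the size of the full matrix: 4 client counts × 3 resolutions × 2 frame rates × 2 audio settings × 2 encoders = 96 configurations, or 576 runs at 6 repeats (before any combinations the harness may prune, e.g. encoder options for 0-client runs). A quick sketch:

```python
from itertools import product

# Dimensions of the test matrix described above
clients      = [0, 1, 2, 3]
resolutions  = ["1280x720", "1920x1080", "2560x1440"]
frame_rates  = [30, 60]
audio        = [False, True]
hw_encoding  = [False, True]
RUNS_PER_CONFIG = 6

configs = list(product(clients, resolutions, frame_rates, audio, hw_encoding))
print(len(configs), len(configs) * RUNS_PER_CONFIG)  # 96 configurations, 576 runs
```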

Metrics Collected¶

Host Performance Metrics¶

Captures host-side performance metrics using the typeperf and nvidia-smi tools.

  • timestamp: The time when the metric was recorded.
  • gpu_utilization: GPU usage percentage, measured via typeperf.
  • working_set_private: Amount of private working set memory in use, measured via typeperf.
  • cpu_usage: CPU usage percentage, measured via typeperf.
  • bytes_received: Total number of bytes received, measured via typeperf.
  • bytes_sent: Total number of bytes sent, measured via typeperf.
  • packets_received: Total number of network packets received, measured via typeperf.
  • packets_sent: Total number of network packets sent, measured via typeperf.
  • nv_gpu_power: GPU power consumption (in watts), measured using nvidia-smi.
  • nv_gpu_temp: GPU temperature, measured using nvidia-smi.
  • nv_gpu_mem_temp: Temperature of the GPU memory, measured using nvidia-smi.
  • nv_gpu_sm: Streaming multiprocessor utilization of the GPU, measured using nvidia-smi.
  • nv_gpu_mem: GPU memory utilization, measured using nvidia-smi.
  • nv_gpu_enc: GPU encoder utilization, measured using nvidia-smi.
  • nv_gpu_dec: GPU decoder utilization, measured using nvidia-smi.
  • nv_gpu_jpg: GPU JPEG processing utilization, measured using nvidia-smi.
  • nv_gpu_ofa: GPU Optical Flow Accelerator (OFA) utilization, measured using nvidia-smi.
  • nv_gpu_mem_clock: GPU memory clock speed, measured using nvidia-smi.
  • nv_gpu_clock: GPU core clock speed, measured using nvidia-smi.
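These fields line up with what `nvidia-smi dmon` and Windows `typeperf` expose. A plausible collection setup might look like the following; the exact counter paths and the process instance name ("game_host") are assumptions, not taken from the test harness:

```shell
# GPU counters (power, temperatures, sm/mem/enc/dec utilization, clocks),
# sampled once per second; -s selects metric groups (p=power/temp,
# u=utilization, c=clocks, m=frame-buffer memory).
nvidia-smi dmon -s pucm -d 1

# Host-side Windows performance counters via typeperf at a 1 s interval.
typeperf "\Processor(_Total)\% Processor Time" ^
         "\Process(game_host)\Working Set - Private" ^
         "\Network Interface(*)\Bytes Sent/sec" ^
         "\Network Interface(*)\Bytes Received/sec" -si 1
```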

WebRTC Candidate Pair Performance Metrics¶

Captures unified WebRTC connection metrics that are not specific to a single video, audio, or data channel.

  • timestamp: The time when the metric was recorded.
  • transport_id: Identifier for the transport channel used.
  • local_candidate_id: Identifier for the local ICE candidate.
  • remote_candidate_id: Identifier for the remote ICE candidate.
  • state: Current state of the candidate pair (e.g., connected, disconnected).
  • priority: Priority value assigned to the candidate pair.
  • nominated: Indicates whether this candidate pair was nominated for use.
  • writable: Indicates if the candidate pair is writable.
  • packets_sent: Total number of packets sent.
  • packets_sent_per_s: Rate of packets sent per second.
  • bytes_sent: Total bytes sent.
  • bytes_sent_in_bits_per_s: Bitrate (in bits per second) of sent data.
  • packets_received: Total number of packets received.
  • packets_received_per_s: Rate of packets received per second.
  • bytes_received: Total bytes received.
  • bytes_received_in_bits_per_s: Bitrate (in bits per second) of received data.
  • total_round_trip_time: Cumulative round-trip time for the candidate pair.
  • total_round_trip_time_per_responses_received: Average round-trip time per response received.
  • current_round_trip_time: Most recent round-trip time measured.
  • available_outgoing_bitrate: Available outgoing bitrate for the connection.
  • requests_received: Number of connectivity check requests received.
  • requests_sent: Number of connectivity check requests sent.
  • responses_received: Number of connectivity check responses received.
  • responses_sent: Number of connectivity check responses sent.
  • consent_requests_sent: Number of consent requests sent.
  • packets_discarded_on_send: Count of packets discarded during send operations.
  • bytes_discarded_on_send: Total bytes discarded during send operations.
  • last_packet_received_timestamp: Timestamp of the last packet received.
  • last_packet_sent_timestamp: Timestamp of the last packet sent.
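Many of the `*_per_s` and `*_in_bits_per_s` fields above are not reported by WebRTC directly; they can be derived by differencing the cumulative counters between successive stats snapshots. A minimal sketch (the snapshot layout here is illustrative):

```python
def per_second_rate(prev: dict, curr: dict, key: str) -> float:
    """Turn a cumulative counter into an instantaneous rate using two
    consecutive stats snapshots; timestamps are in seconds."""
    dt = curr["timestamp"] - prev["timestamp"]
    if dt <= 0:
        return 0.0
    return (curr[key] - prev[key]) / dt

prev = {"timestamp": 10.0, "bytes_sent": 1_000_000}
curr = {"timestamp": 11.0, "bytes_sent": 1_128_000}
bytes_sent_per_s = per_second_rate(prev, curr, "bytes_sent")   # 128000.0
bytes_sent_in_bits_per_s = bytes_sent_per_s * 8                # 1024000.0
```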

WebRTC Video Performance Metrics¶

Captures key video metrics collected from each connected client.

  • timestamp: The time when the metric was recorded.
  • ssrc: Synchronization source identifier for the media stream.
  • kind: The type of media (e.g., video).
  • transport_id: Identifier for the transport channel used.
  • codec_id: Unique identifier for the codec.
  • codec: Name of the codec used for encoding.
  • jitter: Variation in packet arrival times.
  • packets_lost: Number of packets that failed to arrive.
  • track_identifier: Unique identifier for the media track.
  • mid: Media stream identifier.
  • packets_received: Total count of packets received.
  • packets_received_per_s: Rate of packets received per second.
  • bytes_received: Total bytes of data received.
  • bytes_received_in_bits_per_s: Bitrate (in bits per second) of the received data.
  • header_bytes_received: Total header bytes received.
  • header_bytes_received_in_bits_per_s: Bitrate (in bits per second) for header bytes.
  • retransmitted_packets_received: Number of retransmitted packets received.
  • retransmitted_packets_received_per_s: Rate of retransmitted packets per second.
  • retransmitted_bytes_received: Total bytes received from retransmitted packets.
  • retransmitted_bytes_received_in_bits_per_s: Bitrate of retransmitted bytes.
  • rtx_ssrc: SSRC for the retransmission stream.
  • last_packet_received_timestamp: Timestamp marking the last received packet.
  • jitter_buffer_delay: Total delay introduced by the jitter buffer.
  • jitter_buffer_delay_per_jitter_buffer_emitted_count_in_ms: Average delay per jitter buffer emission (in ms).
  • jitter_buffer_target_delay: Configured target delay for the jitter buffer.
  • jitter_buffer_target_delay_per_jitter_buffer_emitted_count_in_m: Average target delay per emission.
  • jitter_buffer_minimum_delay: Configured minimum delay for the jitter buffer.
  • jitter_buffer_minimum_delay_per_jitter_buffer_emitted_count_in_: Average minimum delay per jitter buffer emission.
  • jitter_buffer_emitted_count: Number of times the jitter buffer emitted data.
  • frames_received: Total number of video frames received.
  • frames_received_per_s: Rate of video frames received per second.
  • frame_width: Width of the video frame in pixels.
  • frame_height: Height of the video frame in pixels.
  • frames_per_second: Reported frame rate of the video.
  • frames_decoded: Number of video frames decoded.
  • frames_decoded_per_s: Rate of decoded frames per second.
  • key_frames_decoded: Count of key frames that were decoded.
  • key_frames_decoded_per_s: Rate of key frames decoded per second.
  • frames_dropped: Number of video frames dropped.
  • total_decode_time: Total time spent decoding video frames.
  • total_decode_time_per_frames_decoded_in_ms: Average decode time per frame (in ms).
  • total_processing_delay: Cumulative delay due to processing.
  • total_processing_delay_per_jitter_buffer_emitted_count_in_ms: Average processing delay per jitter buffer emission (in ms).
  • total_assembly_time: Total time spent assembling frames from packets.
  • total_assembly_time_per_frames_assembled_from_multiple_packets_: Average assembly time per frame assembled from multiple packets.
  • frames_assembled_from_multiple_packets: Count of frames assembled from multiple packets.
  • total_inter_frame_delay: Sum of delays between consecutive frames.
  • total_inter_frame_delay_per_frames_decoded_in_ms: Average inter-frame delay per decoded frame (in ms).
  • total_squared_inter_frame_delay: Sum of squared inter-frame delays.
  • inter_frame_delay_st_dev_in_ms: Standard deviation of inter-frame delay (in ms).
  • pause_count: Number of pauses detected in the video stream.
  • total_pauses_duration: Total duration of all pauses.
  • freeze_count: Count of video freezes.
  • total_freezes_duration: Total duration of freezes.
  • decoder_implementation: Identifier for the decoder implementation used.
  • fir_count: Count of Full Intra Requests (FIR).
  • pli_count: Count of Picture Loss Indications (PLI).
  • nack_count: Count of Negative Acknowledgments (NACK).
  • goog_timing_frame_info: Additional timing frame information (Google-specific).
  • power_efficient_decoder: Indicates if a power-efficient decoder was used.
  • min_playout_delay: Minimum delay before video playback begins.

WebRTC Audio Performance Metrics¶

Captures key audio metrics collected from each connected client.

  • timestamp: The time when the metric was recorded.
  • ssrc: Synchronization source identifier for the audio stream.
  • kind: Type of media (audio).
  • transport_id: Identifier for the transport channel used.
  • codec_id: Unique identifier for the audio codec.
  • codec: Name of the audio codec employed.
  • jitter: Variation in the arrival times of audio packets.
  • packets_lost: Number of audio packets that did not arrive.
  • playout_id: Identifier for the audio playout instance.
  • track_identifier: Unique identifier for the audio track.
  • mid: Media stream identifier.
  • remote_id: Identifier for the remote endpoint.
  • packets_received: Total count of audio packets received.
  • packets_received_per_s: Rate of audio packets received per second.
  • packets_discarded: Count of audio packets discarded.
  • packets_discarded_per_s: Rate at which packets are discarded per second.
  • fec_packets_received: Number of Forward Error Correction (FEC) packets received.
  • fec_packets_received_per_s: Rate of FEC packets received per second.
  • fec_packets_discarded: Count of FEC packets discarded.
  • fec_packets_discarded_per_s: Rate of discarded FEC packets per second.
  • bytes_received: Total bytes of audio data received.
  • bytes_received_in_bits_per_s: Bitrate (in bits per second) of the received audio data.
  • header_bytes_received: Total header bytes for audio packets.
  • header_bytes_received_in_bits_per_s: Bitrate for the received header bytes.
  • last_packet_received_timestamp: Timestamp marking the last received audio packet.
  • jitter_buffer_delay: Total delay introduced by the jitter buffer.
  • jitter_buffer_delay_per_jitter_buffer_emitted_count_in_ms: Average delay per jitter buffer emission (in ms).
  • jitter_buffer_target_delay: Configured target delay for the jitter buffer.
  • jitter_buffer_target_delay_per_jitter_buffer_emitted_count_in_m: Average target delay per emission (in ms).
  • jitter_buffer_minimum_delay: Configured minimum delay for the jitter buffer.
  • jitter_buffer_minimum_delay_per_jitter_buffer_emitted_count_in_: Average minimum delay per jitter buffer emission.
  • jitter_buffer_emitted_count: Number of times the jitter buffer emitted audio packets.
  • total_samples_received: Total number of audio samples received.
  • total_samples_received_per_s: Rate of audio samples received per second.
  • concealed_samples: Count of audio samples concealed due to packet loss.
  • concealed_samples_per_s: Rate of concealed audio samples per second.
  • concealed_samples_per_total_samples_received: Ratio of concealed samples to the total samples received.
  • silent_concealed_samples: Number of silent audio samples concealed.
  • silent_concealed_samples_per_s: Rate of silent concealed samples per second.
  • concealment_events: Number of events where audio concealment occurred.
  • inserted_samples_for_deceleration: Samples inserted to slow down audio playback.
  • inserted_samples_for_deceleration_per_s: Rate of inserted samples for deceleration per second.
  • removed_samples_for_acceleration: Samples removed to speed up audio playback.
  • removed_samples_for_acceleration_per_s: Rate of removed samples for acceleration per second.
  • audio_level: Measured audio level.
  • total_audio_energy: Total energy of the received audio.
  • audio_level_in_rms: Audio level measured in Root Mean Square (RMS).
  • total_samples_duration: Cumulative duration of the received audio samples.
  • total_processing_delay: Total delay due to audio processing.
  • total_processing_delay_per_jitter_buffer_emitted_count_in_ms: Average processing delay per jitter buffer emission (in ms).
  • jitter_buffer_flushes: Number of times the jitter buffer was flushed.
  • delayed_packet_outage_samples: Count of audio samples affected by delayed packet outages.
  • relative_packet_arrival_delay: Relative delay in the arrival of audio packets.
  • interruption_count: Number of audio interruptions detected.
  • total_interruption_duration: Cumulative duration of all audio interruptions.

WebRTC Data Channel Metrics¶

Captures metrics about the WebRTC data channel responsible for dispatching client input events to the host.

  • timestamp: The time when the metric was recorded.
  • label: A descriptive label for the data message or event.
  • protocol: The protocol used for data channel communication.
  • data_channel_identifier: Unique identifier for the data channel.
  • state: Current state of the data channel (e.g., open, closed)
  • messages_sent: Total number of messages sent over the data channel.
  • messages_sent_per_s: Rate of messages sent per second.
  • bytes_sent: Total bytes sent over the data channel.
  • bytes_sent_in_bits_per_s: Bitrate (in bits per second) of sent data.
  • messages_received: Total number of messages received over the data channel.
  • messages_received_per_s: Rate of messages received per second.
  • bytes_received: Total bytes received over the data channel.
  • bytes_received_in_bits_per_s: Bitrate (in bits per second) of received data.

Supplementary Delay Measurement Metrics¶

Captures a metric that is used to calculate the real end-to-end latency for video frames.

  • timestamp: The time when the metric was recorded.
  • encode_to_decode_delay: The time it takes for a frame encoded by the host to reach the client's decoding pipeline.

End-to-End Latency Calculation¶

The following graph shows the lifecycle of a frame from the game engine on the host PC to the client's monitor.
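The latency decomposition used in the SQL query can be expressed directly: the host-side (render-to-encoded) and client-side (decoding-to-render) components come from fields 2/4 and 11/13 of Chromium's comma-separated `goog_timing_frame_info` string, and the network component is the `encode_to_decode_delay` measurement described above. A sketch (the 1-based field positions mirror the query's SPLIT_PART calls; the semantic labels are our interpretation; all values in ms):

```python
def video_latency_breakdown_ms(timing_info: str, encode_to_decode_delay_ms: float) -> dict:
    """Decompose end-to-end video latency from a goog_timing_frame_info
    string plus the clock-synchronized encode-to-decode measurement."""
    f = timing_info.split(",")
    render_to_encoded = float(f[3]) - float(f[1])     # host: capture -> encoded
    decoding_to_render = float(f[12]) - float(f[10])  # client: decode -> render
    return {
        "render_to_encoded": render_to_encoded,
        "encoded_to_decoding": encode_to_decode_delay_ms,  # network + queueing
        "decoding_to_render": decoding_to_render,
        "e2e": render_to_encoded + encode_to_decode_delay_ms + decoding_to_render,
    }
```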


In [ ]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sqlalchemy import create_engine
from IPython.display import display, HTML
from dotenv import load_dotenv
import os
import plotly.express as plotlyx
load_dotenv()

DATABASE_URL = os.getenv('DATABASE_URL')
if not DATABASE_URL:
    raise ValueError("DATABASE_URL is not set in the environment.")

engine = create_engine(DATABASE_URL)
query = """
SELECT
			--dimensions
			R.ID AS RUN_ID,
			R.CLIENTS,
			CONCAT(R.WIDTH::TEXT, 'x', R.HEIGHT::TEXT) AS RESOLUTION,
			R.AUDIO,
			R.FRAME_RATE,
			R.HARDWARE,
			D.CLIENT_ID,
			PM.TIMESTAMP,

			--host metrics
			ROUND(
				EXTRACT(
					EPOCH
					FROM
						(PM.TIMESTAMP - R.GAME_STARTED_AT)
				)
			) AS SECONDS,
			PM.CPU_USAGE::DOUBLE PRECISION / 16 AS HOST_CPU, --normalize across the host's 16 logical processors
			PM.NV_GPU_SM::DOUBLE PRECISION AS HOST_GPU,
			PM.NV_GPU_POWER::DOUBLE PRECISION AS HOST_GPU_POWER,
			PM.WORKING_SET_PRIVATE / 1048576 AS HOST_MEMORY_MB,
			PM.BYTES_SENT / 1024 AS HOST_KBYTES_SENT_PER_S,

			--client video metrics
			SPLIT_PART(V.GOOG_TIMING_FRAME_INFO, ',', 4)::DOUBLE PRECISION - SPLIT_PART(V.GOOG_TIMING_FRAME_INFO, ',', 2)::DOUBLE PRECISION AS CLIENT_VIDEO_RENDER_TO_ENCODED_LATENCY,
			D.DELAY::DOUBLE PRECISION AS CLIENT_VIDEO_ENCODED_TO_DECODING_LATENCY,
			SPLIT_PART(V.GOOG_TIMING_FRAME_INFO, ',', 13)::DOUBLE PRECISION - SPLIT_PART(V.GOOG_TIMING_FRAME_INFO, ',', 11)::DOUBLE PRECISION AS CLIENT_VIDEO_DECODING_TO_RENDER_LATENCY,
			SPLIT_PART(V.GOOG_TIMING_FRAME_INFO, ',', 4)::DOUBLE PRECISION - SPLIT_PART(V.GOOG_TIMING_FRAME_INFO, ',', 2)::DOUBLE PRECISION + D.DELAY::DOUBLE PRECISION + SPLIT_PART(V.GOOG_TIMING_FRAME_INFO, ',', 13)::DOUBLE PRECISION - SPLIT_PART(V.GOOG_TIMING_FRAME_INFO, ',', 11)::DOUBLE PRECISION AS CLIENT_VIDEO_E2E_LATENCY,
			V.BYTES_RECEIVED_IN_BITS_PER_S / 8192::DOUBLE PRECISION AS CLIENT_VIDEO_KBYTES_RECEIVED_PER_S,
			V.FRAMES_RECEIVED_PER_S::DOUBLE PRECISION AS CLIENT_VIDEO_FRAMES_RECEIVED_PER_S,
			V.PACKETS_RECEIVED_PER_S::DOUBLE PRECISION AS CLIENT_VIDEO_PACKETS_RECEIVED_PER_S,
			V.FRAMES_DROPPED::DOUBLE PRECISION AS CLIENT_VIDEO_TOTAL_FRAMES_DROPPED,
			V.PACKETS_LOST::DOUBLE PRECISION AS CLIENT_VIDEO_TOTAL_PACKETS_LOST,
			V.JITTER_BUFFER_DELAY_PER_JITTER_BUFFER_EMITTED_COUNT_IN_MS::DOUBLE PRECISION AS CLIENT_PER_VIDEO_FRAME_JITTER_DELAY,
			V.TOTAL_DECODE_TIME_PER_FRAMES_DECODED_IN_MS::DOUBLE PRECISION AS CLIENT_PER_VIDEO_FRAME_DECODE_DELAY,
			V.TOTAL_ASSEMBLY_TIME_PER_FRAMES_ASSEMBLED_FROM_MULTIPLE_PACKETS_::DOUBLE PRECISION AS CLIENT_PER_VIDEO_FRAME_ASSEMBLY_DELAY,
			V.TOTAL_PROCESSING_DELAY_PER_JITTER_BUFFER_EMITTED_COUNT_IN_MS::DOUBLE PRECISION AS CLIENT_PER_VIDEO_FRAME_TOTAL_PROCESSING_DELAY,

			--client audio metrics
			A.BYTES_RECEIVED_IN_BITS_PER_S / 8192::DOUBLE PRECISION AS CLIENT_AUDIO_KBYTES_RECEIVED_PER_S,
			A.PACKETS_RECEIVED_PER_S::DOUBLE PRECISION AS CLIENT_AUDIO_PACKETS_RECEIVED_PER_S,
			A.PACKETS_LOST::DOUBLE PRECISION AS CLIENT_AUDIO_TOTAL_PACKETS_LOST,
			A.JITTER_BUFFER_DELAY_PER_JITTER_BUFFER_EMITTED_COUNT_IN_MS::DOUBLE PRECISION AS CLIENT_PER_AUDIO_FRAME_JITTER_DELAY,
			A.TOTAL_PROCESSING_DELAY_PER_JITTER_BUFFER_EMITTED_COUNT_IN_MS::DOUBLE PRECISION AS CLIENT_PER_AUDIO_FRAME_TOTAL_PROCESSING_DELAY,
            
			--client data channel metrics
			DA.BYTES_RECEIVED_IN_BITS_PER_S / 8192::DOUBLE PRECISION AS CLIENT_DATA_KBYTES_RECEIVED_PER_S,
			DA.BYTES_SENT_IN_BITS_PER_S / 8192::DOUBLE PRECISION AS CLIENT_DATA_KBYTES_SENT_PER_S,

			--client total metrics (video + audio + data)
			C.BYTES_RECEIVED_IN_BITS_PER_S / 8192::DOUBLE PRECISION AS CLIENT_TOTAL_KBYTES_RECEIVED_PER_S,
			C.BYTES_SENT_IN_BITS_PER_S / 8192::DOUBLE PRECISION AS CLIENT_TOTAL_KBYTES_SENT_PER_S,
			C.CURRENT_ROUND_TRIP_TIME * 1000::DOUBLE PRECISION AS CLIENT_CP_RTT
		FROM
			RUNS R
			LEFT JOIN PERFORMANCE_METRICS PM ON R.ID = PM.RUN_ID
			AND PM.TIMESTAMP > R.GAME_STARTED_AT + INTERVAL '2 second'
			AND PM.TIMESTAMP < R.GAME_STARTED_AT + INTERVAL '1 second' * R.DURATION - INTERVAL '2 second'
			LEFT JOIN DELAY_MEASUREMENTS D ON R.ID = D.RUN_ID
			AND ROUND(
				EXTRACT(
					EPOCH
					FROM
						(PM.TIMESTAMP - D.TIMESTAMP)
				)
			) = 0
			LEFT JOIN WEBRTC_VIDEO_METRICS V ON R.ID = V.RUN_ID
			AND V.CLIENT_ID = D.CLIENT_ID
			AND ROUND(
				EXTRACT(
					EPOCH
					FROM
						(PM.TIMESTAMP - V.TIMESTAMP)
				)
			) = 0
			LEFT JOIN WEBRTC_AUDIO_METRICS A ON R.ID = A.RUN_ID
			AND A.CLIENT_ID = D.CLIENT_ID
			AND ROUND(
				EXTRACT(
					EPOCH
					FROM
						(PM.TIMESTAMP - A.TIMESTAMP)
				)
			) = 0
			LEFT JOIN WEBRTC_CANDIDATE_PAIR_METRICS C ON R.ID = C.RUN_ID
			AND C.CLIENT_ID = D.CLIENT_ID
			AND ROUND(
				EXTRACT(
					EPOCH
					FROM
						(PM.TIMESTAMP - C.TIMESTAMP)
				)
			) = 0
			LEFT JOIN WEBRTC_DATA_METRICS DA ON R.ID = DA.RUN_ID
			AND DA.CLIENT_ID = D.CLIENT_ID
			AND ROUND(
				EXTRACT(
					EPOCH
					FROM
						(PM.TIMESTAMP - DA.TIMESTAMP)
				)
			) = 0
"""

all_data = pd.read_sql(query, engine)
In [7]:
aggregated_data = all_data.groupby([
    "run_id", 
    "clients", 
    "resolution", 
    "audio", 
    "frame_rate", 
    "hardware", 
    "client_id"
], dropna=False).agg({
    "host_cpu": "mean",
    "host_gpu": "mean",
    "host_gpu_power": "mean",
    "host_memory_mb": "mean",
    "host_kbytes_sent_per_s": "mean",
    "client_video_render_to_encoded_latency": "mean",
    "client_video_encoded_to_decoding_latency": "mean",
    "client_video_decoding_to_render_latency": "mean",
    "client_video_e2e_latency": "mean",
    "client_video_kbytes_received_per_s": "mean",
    "client_video_frames_received_per_s": "mean",
    "client_video_packets_received_per_s": "mean",
    "client_video_total_frames_dropped": "mean",
    "client_video_total_packets_lost": "mean",
    "client_per_video_frame_jitter_delay": "mean",
    "client_per_video_frame_decode_delay": "mean",
    "client_per_video_frame_assembly_delay": "mean",
    "client_per_video_frame_total_processing_delay": "mean",
    "client_audio_kbytes_received_per_s": "mean",
    "client_audio_packets_received_per_s": "mean",
    "client_audio_total_packets_lost": "mean",
    "client_per_audio_frame_jitter_delay": "mean",
    "client_per_audio_frame_total_processing_delay": "mean",
    "client_data_kbytes_received_per_s": "mean",
    "client_data_kbytes_sent_per_s": "mean",
    "client_total_kbytes_received_per_s": "mean",
    "client_total_kbytes_sent_per_s": "mean",
    "client_cp_rtt": "mean"
}).reset_index()

def align_yaxis(axes):
    y_coords = [ax.transData.transform((0, 0))[1] for ax in axes]
    y_target = sum(y_coords) / len(y_coords)
    for ax, y in zip(axes, y_coords):
        adjust_yaxis(ax, y_target - y)


def adjust_yaxis(ax, ydif):
    inv = ax.transData.inverted()
    _, dy = inv.transform((0, 0)) - inv.transform((0, ydif))
    
    miny, maxy = ax.get_ylim()
    if -miny > maxy or (-miny == maxy and dy > 0):
        nminy = miny
        nmaxy = miny * (maxy + dy) / (miny + dy)
    else:
        nmaxy = maxy
        nminy = maxy * (miny + dy) / (maxy + dy)
    ax.set_ylim(nminy, nmaxy)


px = 1/plt.rcParams['figure.dpi']
    
def visualize_performance_metrics(
    where_clause="", 
    title="Performance Metrics Visualization",
    subtitle="",
    additional_notes="",
    include_bytes_sent=False,
    include_delay=False,
    figsize=(1920*px, 1080*px),
    dimension_column="frame_rate",
    dimension_label="Frame Rate (FPS)"
):    
    metrics = ['host_cpu', 'host_gpu', 'host_memory_mb']
    if include_bytes_sent:
        metrics.append('host_kbytes_sent_per_s')
    if include_delay:
        metrics.append('client_video_e2e_latency')

    if where_clause:
        filtered_data = aggregated_data.query(where_clause)
    else:
        filtered_data = aggregated_data
    
    
    agg_dict = {metric: ['std', 'mean', 'count'] for metric in metrics}
    stats_df = filtered_data.groupby(dimension_column).agg(agg_dict)

    summary_dict = {
        'CPU Mean (%)': stats_df[('host_cpu', 'mean')],
        'CPU StdDev (%)': stats_df[('host_cpu', 'std')],
        
        'GPU Mean (%)': stats_df[('host_gpu', 'mean')],
        'GPU StdDev (%)': stats_df[('host_gpu', 'std')],
        
        'Memory Mean (MB)': stats_df[('host_memory_mb', 'mean')],
        'Memory StdDev (MB)': stats_df[('host_memory_mb', 'std')],
    }
    if include_bytes_sent:
        summary_dict['Data out/s Mean (KB)'] = stats_df[('host_kbytes_sent_per_s', 'mean')]
        summary_dict['Data out/s StdDev (KB)'] = stats_df[('host_kbytes_sent_per_s', 'std')]
    if include_delay:
        summary_dict['Latency Mean (ms)'] = stats_df[('client_video_e2e_latency', 'mean')]
        summary_dict['Latency StdDev (ms)'] = stats_df[('client_video_e2e_latency', 'std')]
    
    summary_dict['Sample Count'] = stats_df[('host_cpu', 'count')]
    
    summary_df = pd.DataFrame(summary_dict).reset_index().rename(columns={dimension_column: dimension_label})
    
    header_html = f"""
    <h1>{title}</h1>
    <p>{subtitle}</p>
    """
    display(HTML(header_html))

    axises = []
    

    fig, ax1 = plt.subplots(figsize=figsize)
    ax2 = ax1.twinx()
    axises.append(ax1)
    axises.append(ax2)
    if include_delay:
        ax3 = ax1.twinx()
        ax3.spines.right.set_position(("axes", 1.1))
        ax3.set_ylabel('Latency (ms)', color='black')
        axises.append(ax3)
    
    x = np.arange(len(summary_df))
    width = 0.1
    
    
    total_bars = 3
    if include_bytes_sent:
        total_bars += 1
    if include_delay:
        total_bars += 1
    
    bar_positions = np.linspace(-(total_bars-1)/2 * width*1.1, (total_bars-1)/2 * width*1.1, total_bars)
    

    ax1.bar(
        x + bar_positions[0],
        summary_df['CPU Mean (%)'],
        width,
        yerr=summary_df['CPU StdDev (%)'],
        capsize=4,
        label='CPU (%)',
        color='#a559aa'
    )
    
    ax1.bar(
        x + bar_positions[1],
        summary_df['GPU Mean (%)'],
        width,
        yerr=summary_df['GPU StdDev (%)'],
        capsize=4,
        label='GPU (%)',
        color='#59a89c'
    )
    
    ax2.bar(
        x + bar_positions[2],
        summary_df['Memory Mean (MB)'],
        width,
        yerr=summary_df['Memory StdDev (MB)'],
        capsize=4,
        label='Memory (MB)',
        color='#f0c571'
    )
    
    if include_bytes_sent:
        ax2.bar(
            x + bar_positions[3],
            summary_df['Data out/s Mean (KB)'],
            width,
            yerr=summary_df['Data out/s StdDev (KB)'],
            capsize=4,
            label='Data out/s (KB)',
            color='#e02b35'
        )

    if include_delay:
        ax3.bar(
            x + bar_positions[4],
            summary_df['Latency Mean (ms)'],
            width,
            yerr=summary_df['Latency StdDev (ms)'],
            capsize=4,
            label='Latency (ms)',
            color='#2066a8'
        )
    
    ax1.set_xticks(x)
    ax1.set_xticklabels(summary_df[dimension_label])
    
    ax1.set_xlabel(dimension_label)
    ax1.set_ylabel('CPU & GPU Usage (%)', color='black')
    
    if include_bytes_sent:
        ax2.set_ylabel('Memory (MB) & Data Out (KB)', color='black')
    else:
        ax2.set_ylabel('Memory (MB)', color='black')
    
    lines1, labels1 = ax1.get_legend_handles_labels()
    lines2, labels2 = ax2.get_legend_handles_labels()
    if include_delay:
        lines3, labels3 = ax3.get_legend_handles_labels()
        ax1.legend(lines1 + lines2 + lines3, labels1 + labels2 + labels3, loc='upper left')
    else:
        ax1.legend(lines1 + lines2, labels1 + labels2, loc='upper left')
    
    ax1.grid(True, axis='y', alpha=0.3)
    
    align_yaxis(axises)
    
    #plt.tight_layout()
    plt.show()
    
    display(summary_df)
    
    if additional_notes:
        note_html = f"""
        <p style='text-align: left; font-style: italic;'>
        {additional_notes}
        </p>
        """
        display(HTML(note_html))

    display(HTML("<p style=\"page-break-after:always;\"></p>"))
In [10]:
samples = aggregated_data.groupby([
    'clients', 'resolution', 'audio', 'frame_rate', 'hardware'
]).size().reset_index(name='sample_count')


samples['clients_label'] = samples['clients'].astype(str) + ' Clients'
samples['resolution_label'] = samples['resolution'].astype(str)
samples['hardware_label'] = samples['hardware'].apply(lambda x: 'HW enc' if x else 'SW enc')
samples['audio_label'] = samples['audio'].apply(lambda x: 'Has Audio' if x else 'No Audio')
samples['frame_rate_label'] = samples['frame_rate'].astype(str) + ' FPS'

fig = plotlyx.sunburst(
    samples,
    path=['clients_label', 'resolution_label', 'hardware_label', 'audio_label', 'frame_rate_label'],
    values='sample_count',
    title='Sample Count per Configuration',
    width=1000,
    height=1000
)
fig.update_traces(hovertemplate=('Configuration: %{id}<br>Samples: %{value}'))
fig.update_layout(title_x=0.5)
fig.show(renderer="notebook")
In [11]:
visualize_performance_metrics(
    where_clause="clients == 0",
    title="Baseline Metrics for Single-Player Game Runs",
    subtitle="The following data only includes runs with 0 clients (single-player mode)<br>The only dimension that affects the results is the FPS setting",
)

Baseline Metrics for Single-Player Game Runs

The following data only includes runs with 0 clients (single-player mode)
The only dimension that affects the results is the FPS setting

| Frame Rate (FPS) | CPU Mean (%) | CPU StdDev (%) | GPU Mean (%) | GPU StdDev (%) | Memory Mean (MB) | Memory StdDev (MB) | Sample Count |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 30 | 0.679967 | 0.069591 | 1.351515 | 0.825516 | 575.240729 | 1.509627 | 6 |
| 60 | 1.192219 | 0.199224 | 2.145455 | 0.022998 | 574.891773 | 1.612488 | 6 |

In [12]:
visualize_performance_metrics(
    where_clause="clients > 0",
    title="Multi-Client Runs (FPS vs. Host Metrics)",
    subtitle="The following data only includes runs with clients > 0.",
    additional_notes="The frame rate might not directly affect data transfer because the bitrate of the video encoder can be independent of the frame rate.",
    include_bytes_sent=True,
    include_delay=True
)

Multi-Client Runs (FPS vs. Host Metrics)

The following data only includes runs with clients > 0.

| Frame Rate (FPS) | CPU Mean (%) | CPU StdDev (%) | GPU Mean (%) | GPU StdDev (%) | Memory Mean (MB) | Memory StdDev (MB) | Data out/s Mean (KB) | Data out/s StdDev (KB) | Latency Mean (ms) | Latency StdDev (ms) | Sample Count |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 30 | 7.185515 | 9.254246 | 5.110426 | 2.079165 | 777.156975 | 164.818335 | 712.301004 | 245.076178 | 111.712728 | 25.953354 | 452 |
| 60 | 10.383856 | 10.419698 | 10.976929 | 5.055670 | 789.022840 | 170.525225 | 693.921914 | 240.500563 | 98.527559 | 24.708917 | 453 |

The frame rate might not directly affect data transfer because the bitrate of the video encoder can be independent of the frame rate.

In [13]:
visualize_performance_metrics(
    title="Host Metrics per number of Simultaneous Clients Served",
    additional_notes="Note: The CPU %, GPU %, memory usage, and data transfer all scale roughly linearly with the number of clients connected.",
    include_bytes_sent=True,
    include_delay=True,
    dimension_column="clients",
    dimension_label="Clients Connected"
)

Host Metrics per number of Simultaneous Clients Served

| Clients Connected | CPU Mean (%) | CPU StdDev (%) | GPU Mean (%) | GPU StdDev (%) | Memory Mean (MB) | Memory StdDev (MB) | Data out/s Mean (KB) | Data out/s StdDev (KB) | Latency Mean (ms) | Latency StdDev (ms) | Sample Count |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 0.936093 | 0.302996 | 1.748485 | 0.694200 | 575.066251 | 1.500326 | 1.625937 | 0.631848 | NaN | NaN | 12 |
| 1 | 3.169951 | 2.066178 | 4.501983 | 1.960265 | 669.745016 | 56.351446 | 319.150368 | 17.944400 | 93.710979 | 11.146932 | 174 |
| 2 | 6.826664 | 5.595538 | 6.878535 | 3.241577 | 748.764220 | 116.951445 | 614.946055 | 52.165038 | 118.036831 | 15.639473 | 290 |
| 3 | 12.291237 | 12.450388 | 10.213925 | 5.421830 | 850.396922 | 192.765971 | 912.562723 | 108.851143 | 100.338403 | 31.441337 | 441 |

Note: CPU %, GPU %, memory usage, and data transfer all scale roughly linearly with the number of connected clients.
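As a rough check on the scaling claim, a least-squares fit over the CPU column of the table above gives the marginal cost per client (a sketch; the numbers are copied from the table):

```python
import numpy as np

# Mean host CPU % per client count, copied from the table above.
clients = np.array([0, 1, 2, 3])
cpu_mean = np.array([0.936093, 3.169951, 6.826664, 12.291237])

# A least-squares line gives a rough per-client cost; the growth is in
# fact slightly superlinear, so treat the slope as an approximation.
slope, intercept = np.polyfit(clients, cpu_mean, 1)
print(f"~{slope:.2f} CPU percentage points per additional client")
```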

In [14]:
visualize_performance_metrics(
    where_clause="clients > 0",
    title="Host Metrics per Resolution Setting",
    subtitle="The following data only includes runs with clients > 0.",
    additional_notes="Note: the resolution has no direct effect on data transfer, as the bitrate of the video is independently controlled by the video encoder.",
    include_bytes_sent=True,
    include_delay=True,
    dimension_column="resolution",
    dimension_label="Resolution"
)

Host Metrics per Resolution Setting

The following data only includes runs with clients > 0.

[Figure: Host Metrics per Resolution Setting]
| Resolution | CPU Mean (%) | CPU StdDev (%) | GPU Mean (%) | GPU StdDev (%) | Memory Mean (MB) | Memory StdDev (MB) | Data Out Mean (KB/s) | Data Out StdDev (KB/s) | Latency Mean (ms) | Latency StdDev (ms) | Sample Count |
|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|
| 1280x720 | 4.916077 | 3.204137 | 5.833656 | 3.105590 | 687.762246 | 56.466530 | 717.682651 | 236.017544 | 91.814920 | 25.301053 | 292 |
| 1920x1080 | 10.071777 | 11.300628 | 7.065345 | 3.153582 | 774.500276 | 125.519767 | 708.875493 | 244.824318 | 118.228192 | 23.691962 | 304 |
| 2560x1440 | 11.179369 | 11.633557 | 11.104105 | 5.937439 | 881.642846 | 213.566784 | 683.641415 | 246.717227 | 105.294696 | 22.530741 | 309 |

Note: the resolution has no direct effect on data transfer, as the bitrate of the video is independently controlled by the video encoder.

In [15]:
visualize_performance_metrics(
    where_clause="clients > 0",
    title="Host Metrics per Video Encoding Method",
    subtitle="The following data only includes runs with clients > 0.",
    additional_notes="<h3>Comparison of Hardware Encoding vs Software Encoding</h3><p>The preceding table shows the results of using Nvidia NVENC versus a software (CPU-based) video encoder:</p>\
    <ul>\
        <li>NVENC clearly reduces CPU and memory usage while increasing GPU utilization</li>\
        <li>NVENC also reduces data transfer per second, as it uses H.264 encoding instead of VP8</li>\
    </ul>",
    include_bytes_sent=True,
    include_delay=True,
    dimension_column="hardware",
    dimension_label="GPU Encoding"
)

Host Metrics per Video Encoding Method

The following data only includes runs with clients > 0.

[Figure: Host Metrics per Video Encoding Method]
| GPU Encoding | CPU Mean (%) | CPU StdDev (%) | GPU Mean (%) | GPU StdDev (%) | Memory Mean (MB) | Memory StdDev (MB) | Data Out Mean (KB/s) | Data Out StdDev (KB/s) | Latency Mean (ms) | Latency StdDev (ms) | Sample Count |
|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|
| False | 14.789171 | 11.212713 | 7.273342 | 4.459553 | 887.096916 | 180.793892 | 747.407974 | 255.215277 | 109.011240 | 26.884519 | 453 |
| True | 2.770454 | 1.055936 | 8.822206 | 5.105815 | 678.865921 | 43.220192 | 658.696612 | 221.299264 | 101.235301 | 24.861710 | 452 |

Comparison of Hardware Encoding vs Software Encoding

The preceding table shows the results of using Nvidia NVENC versus a software (CPU-based) video encoder:

  • NVENC clearly reduces CPU and memory usage while increasing GPU utilization
  • NVENC also reduces data transfer per second, as it uses H.264 encoding instead of VP8
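
The size of the effect can be read straight off the table. A quick calculation with the mean values copied from it:

```python
# Mean values copied from the hardware-vs-software encoding table above.
cpu_sw, cpu_hw = 14.789171, 2.770454
mem_sw, mem_hw = 887.096916, 678.865921

cpu_saving = (cpu_sw - cpu_hw) / cpu_sw * 100  # relative CPU reduction, %
mem_saving = mem_sw - mem_hw                   # absolute memory reduction, MB
print(f"NVENC cuts mean CPU usage by ~{cpu_saving:.0f}% "
      f"and memory by ~{mem_saving:.0f} MB")
```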

In [16]:
visualize_performance_metrics(
    where_clause="clients > 0",
    title="Host Metrics per Audio Streaming Enabled/Disabled",
    subtitle="The following data only includes runs with clients > 0.",
    additional_notes="<h3>Comparison of Audio Streaming On/Off</h3><p>The preceding table shows the results of streaming audio together with the video:</p>\
    <ul>\
        <li>Audio encoding increases CPU usage somewhat, as the encoding happens on the CPU</li>\
        <li>It has a negligible effect on data transfer, as the audio stream is encoded in the Ogg format at a low bitrate.</li>\
    </ul>",
    include_bytes_sent=True,
    include_delay=True,
    dimension_column="audio",
    dimension_label="Audio Streaming"
)

Host Metrics per Audio Streaming Enabled/Disabled

The following data only includes runs with clients > 0.

[Figure: Host Metrics per Audio Streaming Enabled/Disabled]
| Audio Streaming | CPU Mean (%) | CPU StdDev (%) | GPU Mean (%) | GPU StdDev (%) | Memory Mean (MB) | Memory StdDev (MB) | Data Out Mean (KB/s) | Data Out StdDev (KB/s) | Latency Mean (ms) | Latency StdDev (ms) | Sample Count |
|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|
| False | 7.925593 | 8.037016 | 8.080610 | 4.822343 | 777.485190 | 167.037463 | 693.115754 | 238.615666 | 104.174643 | 25.793855 | 452 |
| True | 9.645412 | 11.541699 | 8.013301 | 4.888179 | 788.695349 | 168.379408 | 713.064812 | 246.836755 | 106.078686 | 26.535280 | 453 |

Comparison of Audio Streaming On/Off

The preceding table shows the results of streaming audio together with the video:

  • Audio encoding increases CPU usage somewhat, as the encoding happens on the CPU
  • It has a negligible effect on data transfer, as the audio stream is encoded in the Ogg format at a low bitrate.
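
Again, the overhead can be quantified directly from the table's mean values:

```python
# Mean values copied from the audio on/off table above.
cpu_off, cpu_on = 7.925593, 9.645412
data_off, data_on = 693.115754, 713.064812

cpu_delta = cpu_on - cpu_off                      # extra CPU percentage points
data_pct = (data_on - data_off) / data_off * 100  # relative bandwidth increase, %
print(f"Audio adds ~{cpu_delta:.1f} CPU points and ~{data_pct:.1f}% more data out")
```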

In [17]:
seconds_data = all_data.query("clients == 1").groupby([
    "seconds"
], dropna=False).agg({
    "host_cpu": "mean",
    "host_gpu": "mean",
    "host_gpu_power": "mean",
    "host_memory_mb": "mean",
    "host_kbytes_sent_per_s": "mean",
    "client_video_e2e_latency": "mean",
}).reset_index()

fig, ax1 = plt.subplots(figsize=(1920*px, 1080*px))
ax2 = ax1.twinx()
ax3 = ax1.twinx()
ax3.spines.right.set_position(("axes", 1.1))
ax3.set_ylabel('Latency (ms)', color='black')

plots = [
    (ax1, [
        ('host_cpu', 'Host CPU Usage', 'blue', 1.2),
        ('host_gpu', 'Host GPU Usage', 'green', 1.2)
    ]),
    (ax2, [
        ('host_memory_mb', 'Host RAM Usage (MB)', 'red', 1.5),
        ('host_kbytes_sent_per_s', 'Host Data Sent (KB/s)', 'orange', 1.5)
    ]),
    (ax3, [
        ('client_video_e2e_latency', 'E2E Latency (ms)', 'purple', 1.2)
    ])
]

lines = []
for ax, metrics in plots:
    for col, label, color, _ in metrics:
        line, = ax.plot(seconds_data['seconds'], seconds_data[col], marker='o', color=color, label=label)
        lines.append(line)

ax1.set_xlabel('Seconds')
ax1.set_ylabel('CPU & GPU Usage (%)', color='black')
ax2.set_ylabel('Memory (MB) & Data Out (KB/s)', color='black')
plt.title('Performance Metrics Over Time')

ax1.legend(lines, [l.get_label() for l in lines], loc='upper left')


# Anchor each axis at zero with proportional headroom for its scale.
max_ax1 = max(seconds_data['host_cpu'].max(), seconds_data['host_gpu'].max())
ax1.set_ylim(0, max_ax1 * 1.2)

max_ax2 = max(seconds_data['host_memory_mb'].max(), seconds_data['host_kbytes_sent_per_s'].max())
ax2.set_ylim(0, max_ax2 * 1.5)

max_ax3 = seconds_data['client_video_e2e_latency'].max()
ax3.set_ylim(0, max_ax3 * 1.2)


header_html = """
<h1>Time Series Analysis of Gameplay Metrics</h1>
<p>The figure below illustrates the mean values of various metrics recorded during single client gameplay sessions, with each metric computed on a per-second basis.</p>
<p>It can be seen that metrics remain relatively stable over the entire game session.</p>
"""
display(HTML(header_html))


plt.tight_layout()
plt.show()

Time Series Analysis of Gameplay Metrics

The figure below illustrates the mean values of various metrics recorded during single client gameplay sessions, with each metric computed on a per-second basis.

It can be seen that metrics remain relatively stable over the entire game session.
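
One way to quantify "relatively stable" is a per-metric coefficient of variation. A minimal sketch, using made-up sample values in place of a `seconds_data` column:

```python
import numpy as np

# Hypothetical per-second CPU samples standing in for a seconds_data
# column; in the notebook this would be seconds_data["host_cpu"].
cpu_per_second = np.array([3.1, 3.3, 3.0, 3.2, 3.4, 3.1])

# Coefficient of variation (std / mean): values well below 1 support the
# "relatively stable over the session" observation.
cv = cpu_per_second.std() / cpu_per_second.mean()
print(f"CV = {cv:.3f}")
```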

[Figure: Performance Metrics Over Time]